
feat: use Cubic CC by default #2295

Merged
merged 7 commits into from
Jan 26, 2025
Conversation

mxinden
Collaborator

@mxinden mxinden commented Dec 21, 2024

Firefox uses Cubic by default:

``` yaml
- name: network.http.http3.cc_algorithm
  type: RelaxedAtomicUint32
  value: 1
  mirror: always
  rust: true
```

https://searchfox.org/mozilla-central/rev/f9517009d8a4946dbdd3acd72a31dc34fca79586/modules/libpref/init/StaticPrefList.yaml

This commit updates Neqo to use Cubic instead of New Reno by default.


github-actions bot commented Dec 21, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to 108fb8d.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

@mxinden
Collaborator Author

mxinden commented Jan 5, 2025

Cubic does not seem to grow the cwnd as fast as New Reno in the congestion avoidance phase. More specifically, while our New Reno implementation increases its cwnd by one SMSS in each iteration, our Cubic implementation only does so in every second iteration.

``` rust
expected_cwnd += client.plpmtu();
assert_eq!(cwnd(&client), expected_cwnd);
```

That is surprising to me. I expected Cubic to grow faster than New Reno in the beginning, given its concave increase early in congestion avoidance.

I will give this more thought.

@larseggert
Collaborator

Cubic does not seem to grow the cwnd as fast as New Reno in the congestion avoidance phase.

That seems like a bug: https://datatracker.ietf.org/doc/html/rfc9438#section-1-5

Specifically, CUBIC may increase the congestion window more aggressively than Reno during the congestion avoidance phase. According to [RFC5681], during congestion avoidance, the sender must not increment cwnd by more than Sender Maximum Segment Size (SMSS) bytes once per round-trip time (RTT), whereas CUBIC may increase cwnd much more aggressively.

@mxinden
Collaborator Author

mxinden commented Jan 13, 2025

My assessment above is wrong for the following reason:

Cubic does not seem to grow the cwnd as fast as New Reno in the congestion avoidance phase.

Wrong.

More specifically, while our New Reno implementation increases its cwnd by one SMSS in each iteration, our Cubic implementation only does so in every second iteration.

Correct.

BUT, while New Reno halves its congestion window on a congestion event:

``` rust
(curr_cwnd / 2, acked_bytes / 2)
```

Cubic will reduce it by 30% only:

``` rust
// CUBIC_BETA = 0.7;
pub const CUBIC_BETA_USIZE_DIVIDEND: usize = 7;
pub const CUBIC_BETA_USIZE_DIVISOR: usize = 10;

curr_cwnd * CUBIC_BETA_USIZE_DIVIDEND / CUBIC_BETA_USIZE_DIVISOR,
acked_bytes * CUBIC_BETA_USIZE_DIVIDEND / CUBIC_BETA_USIZE_DIVISOR,
```

Thus, within the 5 iterations of the test, Cubic does not grow its congestion window as fast as New Reno, because Cubic starts off with a larger congestion window after the congestion event, i.e. it has a head start.

@larseggert
Collaborator

Aside: Is there any point in having CUBIC_BETA_USIZE_DIVIDEND and CUBIC_BETA_USIZE_DIVISOR vs. just CUBIC_BETA = 0.7? Are we ever using just one of them without the division?

@mxinden
Collaborator Author

mxinden commented Jan 13, 2025

Aside: Is there any point in having CUBIC_BETA_USIZE_DIVIDEND and CUBIC_BETA_USIZE_DIVISOR vs. just CUBIC_BETA = 0.7? Are we ever using just one of them without the division?

I assume the split is due to its usage with the usizes curr_cwnd and acked_bytes. Splitting beta into a dividend and a divisor allows multiplying a usize by a fraction using only integer arithmetic, without converting to and from floating point.

``` rust
curr_cwnd * CUBIC_BETA_USIZE_DIVIDEND / CUBIC_BETA_USIZE_DIVISOR,
acked_bytes * CUBIC_BETA_USIZE_DIVIDEND / CUBIC_BETA_USIZE_DIVISOR,
```

I don't have an opinion on whether it is worth the complexity it introduces. Given our work ahead (#1912), I suggest we keep it as is for now.


codecov bot commented Jan 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.31%. Comparing base (e006a7d) to head (2dbc210).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2295      +/-   ##
==========================================
+ Coverage   95.29%   95.31%   +0.01%     
==========================================
  Files         114      114              
  Lines       36868    36868              
  Branches    36868    36868              
==========================================
+ Hits        35135    35141       +6     
+ Misses       1727     1721       -6     
  Partials        6        6              


@mxinden mxinden marked this pull request as ready for review January 13, 2025 17:50
@mxinden
Collaborator Author

mxinden commented Jan 14, 2025

Curious what the benchmark results will be.

@mxinden mxinden enabled auto-merge January 15, 2025 20:10

Benchmark results

Performance differences relative to e006a7d.

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [11.834 µs 11.865 µs 11.905 µs]
       change: [-0.8811% -0.4026% +0.0021%] (p = 0.08 > 0.05)

Found 12 outliers among 100 measurements (12.00%)
4 (4.00%) low severe
2 (2.00%) low mild
6 (6.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [2.8828 ms 2.8906 ms 2.8999 ms]
       change: [-0.8223% -0.3308% +0.1434%] (p = 0.19 > 0.05)

Found 9 outliers among 100 measurements (9.00%)
1 (1.00%) low mild
1 (1.00%) high mild
7 (7.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [19.746 µs 19.788 µs 19.835 µs]
       change: [-0.3915% -0.0412% +0.3512%] (p = 0.84 > 0.05)

Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
2 (2.00%) high mild
10 (10.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [5.0724 ms 5.0835 ms 5.0960 ms]
       change: [-0.2659% +0.0593% +0.3919%] (p = 0.73 > 0.05)

Found 15 outliers among 100 measurements (15.00%)
15 (15.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [6.9124 µs 6.9494 µs 6.9911 µs]
       change: [-0.0053% +0.4004% +0.9021%] (p = 0.08 > 0.05)

Found 16 outliers among 100 measurements (16.00%)
4 (4.00%) low mild
3 (3.00%) high mild
9 (9.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.4152 ms 1.4195 ms 1.4251 ms]
       change: [-0.5778% +0.0009% +0.5856%] (p = 0.99 > 0.05)

Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high severe

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [98.771 ns 99.125 ns 99.477 ns]
       change: [-0.6985% -0.2234% +0.2346%] (p = 0.35 > 0.05)

Found 12 outliers among 100 measurements (12.00%)
3 (3.00%) high mild
9 (9.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [116.51 ns 116.75 ns 117.03 ns]
       change: [-0.6398% -0.2113% +0.1820%] (p = 0.32 > 0.05)

Found 14 outliers among 100 measurements (14.00%)
2 (2.00%) low severe
1 (1.00%) low mild
5 (5.00%) high mild
6 (6.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [116.17 ns 116.50 ns 116.92 ns]
       change: [-0.7111% -0.2069% +0.2198%] (p = 0.41 > 0.05)

Found 12 outliers among 100 measurements (12.00%)
7 (7.00%) low severe
1 (1.00%) low mild
4 (4.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [97.802 ns 97.917 ns 98.051 ns]
       change: [-1.2613% -0.2637% +0.6844%] (p = 0.62 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) high mild
6 (6.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [111.34 ms 111.38 ms 111.43 ms]
       change: [-0.4256% -0.3652% -0.3042%] (p = 0.00 < 0.05)

Found 13 outliers among 100 measurements (13.00%)
4 (4.00%) low mild
8 (8.00%) high mild
1 (1.00%) high severe

SentPackets::take_ranges: No change in performance detected.
       time:   [5.4828 µs 5.6345 µs 5.7947 µs]
       change: [-15.342% -4.0164% +4.2355%] (p = 0.64 > 0.05)

Found 7 outliers among 100 measurements (7.00%)
6 (6.00%) high mild
1 (1.00%) high severe

transfer/pacing-false/varying-seeds: Change within noise threshold.
       time:   [42.882 ms 42.963 ms 43.044 ms]
       change: [+2.8088% +3.0786% +3.3434%] (p = 0.00 < 0.05)

Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild

transfer/pacing-true/varying-seeds: Change within noise threshold.
       time:   [42.934 ms 43.004 ms 43.075 ms]
       change: [+2.5597% +2.7924% +3.0366%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

transfer/pacing-false/same-seed: 💔 Performance has regressed.
       time:   [42.714 ms 42.797 ms 42.879 ms]
       change: [+3.1645% +3.4224% +3.6685%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild

transfer/pacing-true/same-seed: Change within noise threshold.
       time:   [42.941 ms 43.012 ms 43.085 ms]
       change: [+1.9150% +2.1313% +2.3645%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected.
       time:   [900.42 ms 909.73 ms 919.34 ms]
       thrpt:  [108.77 MiB/s 109.92 MiB/s 111.06 MiB/s]
change:
       time:   [-2.3569% -0.8115% +0.6345%] (p = 0.28 > 0.05)
       thrpt:  [-0.6305% +0.8181% +2.4138%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected.
       time:   [315.05 ms 317.21 ms 319.44 ms]
       thrpt:  [31.305 Kelem/s 31.525 Kelem/s 31.741 Kelem/s]
change:
       time:   [-0.9305% +0.0081% +0.9152%] (p = 0.99 > 0.05)
       thrpt:  [-0.9069% -0.0081% +0.9392%]

Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: 💚 Performance has improved.
       time:   [33.863 ms 34.036 ms 34.223 ms]
       thrpt:  [29.220  elem/s 29.380  elem/s 29.531  elem/s]
change:
       time:   [-3.3477% -2.6772% -1.9981%] (p = 0.00 < 0.05)
       thrpt:  [+2.0388% +2.7509% +3.4637%]

Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe

1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: No change in performance detected.
       time:   [1.6592 s 1.6762 s 1.6933 s]
       thrpt:  [59.056 MiB/s 59.659 MiB/s 60.268 MiB/s]
change:
       time:   [-2.9356% -1.4871% +0.0531%] (p = 0.05 > 0.05)
       thrpt:  [-0.0531% +1.5095% +3.0244%]

Client/server transfer results

Transfer of 33554432 bytes over loopback.

| Client | Server | CC | Pacing | MTU | Mean [ms] | Min [ms] | Max [ms] |
|---|---|---|---|---|---|---|---|
| gquiche | gquiche | | | 1504 | 567.6 ± 98.1 | 511.8 | 791.0 |
| neqo | gquiche | reno | on | 1504 | 795.6 ± 73.9 | 735.0 | 946.3 |
| neqo | gquiche | reno | | 1504 | 754.4 ± 66.4 | 708.5 | 939.1 |
| neqo | gquiche | cubic | on | 1504 | 764.1 ± 13.4 | 736.5 | 781.8 |
| neqo | gquiche | cubic | | 1504 | 764.5 ± 18.8 | 744.0 | 810.4 |
| msquic | msquic | | | 1504 | 163.1 ± 93.7 | 93.1 | 379.4 |
| neqo | msquic | reno | on | 1504 | 260.7 ± 79.4 | 208.0 | 453.2 |
| neqo | msquic | reno | | 1504 | 205.9 ± 6.3 | 198.1 | 219.1 |
| neqo | msquic | cubic | on | 1504 | 218.2 ± 12.7 | 203.2 | 251.6 |
| neqo | msquic | cubic | | 1504 | 280.1 ± 135.1 | 210.8 | 693.9 |
| gquiche | neqo | reno | on | 1504 | 686.2 ± 92.7 | 555.4 | 858.1 |
| gquiche | neqo | reno | | 1504 | 723.3 ± 147.0 | 549.5 | 1065.5 |
| gquiche | neqo | cubic | on | 1504 | 703.3 ± 89.1 | 569.0 | 817.4 |
| gquiche | neqo | cubic | | 1504 | 698.4 ± 91.4 | 557.7 | 806.9 |
| msquic | neqo | reno | on | 1504 | 491.7 ± 9.1 | 479.0 | 503.8 |
| msquic | neqo | reno | | 1504 | 495.9 ± 47.7 | 464.5 | 585.2 |
| msquic | neqo | cubic | on | 1504 | 499.1 ± 42.0 | 464.9 | 587.0 |
| msquic | neqo | cubic | | 1504 | 552.7 ± 111.5 | 451.7 | 696.1 |
| neqo | neqo | reno | on | 1504 | 487.2 ± 46.0 | 448.8 | 603.7 |
| neqo | neqo | reno | | 1504 | 459.2 ± 12.4 | 439.5 | 474.3 |
| neqo | neqo | cubic | on | 1504 | 448.4 ± 8.5 | 436.0 | 467.4 |
| neqo | neqo | cubic | | 1504 | 495.0 ± 49.0 | 453.3 | 590.8 |


@mxinden mxinden added this pull request to the merge queue Jan 26, 2025
Merged via the queue into mozilla:main with commit 6225352 Jan 26, 2025
63 of 66 checks passed
@mxinden mxinden deleted the cubic branch January 26, 2025 12:10